Weighted single-step genome-wide association study for direct and maternal genetic effects associated with birth and weaning weights in sheep

Body weight is an important economic trait for sheep meat production, and its genetic improvement is considered one of the main goals in the sheep breeding program. Identifying genomic regions that are associated with growth-related traits accelerates the process of animal breeding through marker-assisted selection, which leads to increased response to selection. In this study, we conducted a weighted single-step genome-wide association study (WssGWAS) to identify potential candidate genes for direct and maternal genetic effects associated with birth weight (BW) and weaning weight (WW) in Baluchi sheep. The data used in this research included 13,408 birth and 13,170 weaning records collected at Abbas-Abad Baluchi Sheep Breeding Station, Mashhad-Iran. Genotypic data of 94 lambs genotyped by Illumina 50K SNP BeadChip for 54,241 markers were used. The proportion of variance explained by genomic windows was calculated by summing the variance of SNPs within 1 megabase (Mb). The top 10 window genomic regions explaining the highest percentages of additive and maternal genetic variances were selected as candidate window genomic regions associated with body weights. Our findings showed that for BW, the top-ranked genomic regions (1 Mb windows) explained 4.30 and 4.92% of the direct additive and maternal genetic variances, respectively. The direct additive genetic variance explained by the genomic window regions varied from 0.31 on chromosome 1 to 0.59 on chromosome 8. The highest (0.84%) and lowest (0.32%) maternal genetic variances were explained by genomic windows on chromosome 10 and 17, respectively. For WW, the top 10 genomic regions explained 6.38 and 5.76% of the direct additive and maternal genetic variances, respectively. The highest and lowest contribution of direct additive genetic variances were 1.37% and 0.42%, respectively, both explained by genomic regions on chromosome 2. For maternal effects on WW, the highest (1.38%) and lowest (0.41%) genetic variances were explained by genomic windows on chromosome 2. Further investigation of these regions identified several possible candidate genes associated with body weight. Gene ontology analysis using the DAVID database identified several functional terms, such as translation repressor activity, nucleic acid binding, dehydroascorbic acid transporter activity, growth factor activity and SH2 domain binding.

genetic potential but also by the maternal environment to provide an appropriate environment in the form of suitable feeding.The environmental component may be divided into permanent and temporary environmental parts.It is strongly recommended that the direct additive, maternal additive, and maternal permanent environmental effects should be included in animal models when evaluating the genetic merit to obtain accurate estimates of variance-covariance components 2 .
With the rapid advancement of genome sequencing technologies and the advent of dense marker BeadChips, it is now possible to determine the genotype of tens of thousands of markers and to conduct association studies between genetic markers and phenotypic records.These types of studies, referred to as genome-wide association studies (GWAS), have become a common approach to precisely identify quantitative trait loci (QTL) 3 .
In several previous GWAS related to growth traits in sheep, associated genetic markers or candidate genes for body weight have been reported [4][5][6][7][8][9] .However, the genetic basis of sheep breeds or populations, the varied number of samples used for GWAS, and the different densities of genetic marker panels led to different reports for previous studies.
Among the various approaches proposed for GWAS, the weighted single-step GWAS (WssGWAS) 10 is recently used in populations where a large number of individuals are phenotyped over several generations, but genotyped for a smaller number of individuals who belong usually to the more recent generations.In this method, using phenotypes, genotypes, and a combination of a pedigree-derived relationship matrix and genomic relationship matrix (H matrix), genome-wide estimated breeding values (GEBV) are obtained from a genome-wide single-step best linear unbiased prediction (ssGBLUP, 11 ).In the next step, based on the equivalence between marker effect models and breeding value models, GEBVs are converted to marker effects, and SNP effects are consequently estimated.One of the attractive features of WssGWAS is that it is possible to consider unequal variances for SNPs which leads to improved accuracy in estimating SNP effects 10 .In livestock populations, with the small number of phenotypes and genotypes, for which QTLs with large effects play an important role in phenotypic variance of target traits, WssGWAS may perform better than traditional GWAS methods 12 .This method has recently been used to identify genomic regions affecting production and reproductive traits in livestock [13][14][15] .In the present study, we implemented WssGWAS to identify genomic regions and related candidate genes associated with birth and weaning weights in Baluchi sheep.To understand the genetic basis of birth and weaning weights in sheep, we also performed a post GWAS function analyses including GO and KEGG pathway analyses to identify the biological processes and functional terms significantly enriched with identified candidate genes.

Phenotypes, genotypes, and pedigree
The phenotypic data used in this research included 13,408 birth weight and 13,170 weaning weight data of Baluchi lambs that were collected at Abbas-Abad Baluchi Sheep Breeding Station located in Mashhad-Iran.The pedigree file encompassed 22,160 animals including 518 sires and 6,283 dams.Details of management and feeding for Baluchi sheep were previously reported by Gholizadeh et al. 16 .The birth weights of lambs were recorded after birth.Subsequently, the lambs were raised until 90 days of age to determine the waning weight of each lamb.Characteristics of the data structure are summarized in Supplementary Table 1.
A total of 94 lambs were genotyped by Illumina 50K SNP BeadChip for 54,241 markers.Genotype data were provided by the animal genetics group of Sari Agriculture Science and Natural Resource University-Iran 17 .SNPs with unknown positions on the Ovine genome, and SNPs located on sex chromosomes were initially excluded from the data.Further quality control for Autosome SNPs was performed by Plink 18 using the following criteria: individual call rate ≥ 95%; SNP call rate ≥ 95%; minor allele frequency ≥ 0.05; strong Hardy-Weinberg equilibrium violation p-value ≥ 10 −6 .After quality control, 94 lambs, and 41,102 SNPs remained for further analysis.For WssGWAS analysis, missing genotypes were imputed using Beagle software version 4.1 19 .

Statistical model
First, weaning weights were adjusted to 90 days of age as follows, where BW and WW are the birth weight and weaning weight of lambs, respectively.The univariate animal model including maternal effects used for ssGBLUP was as follows: where y is the vector of observations; b is the vector of significant fixed effects (sex, birth type, herd, birth year, and age of mother at lambing); a ∼ N(0, Hσ 2 a ) is the vector of direct additive genetic effects; p ∼ N(0, Iσ 2 p ) is the vector of maternal permanent environmental effects; m ∼ N(0, Hσ 2 m ) is the vector of maternal genetic effects; e ∼ N(0, Iσ 2 e ) is the vector of random residuals; where σ 2 a , σ 2 p , σ 2 m , and σ 2 e are the additive genetic, maternal permanent environmental, maternal genetic and residual variances, respectively.X, Z, W and S are the incidence matrices of b, a, p, and m, respectively.
H is the hybrid matrix that combines pedigree and genomic information 11,20 , and I is an identity matrix.The inverse of the H matrix was calculated as follows, unadjusted WW − BW lamb age day at weaning × 90 where A is the numerator relationship matrix based on pedigree data for all individuals; A 22 is the numerator relationship matrix corresponding to the genotyped individuals; G ω is the weighted genomic relationship matrix which was obtained using G ω = αG + βA 22 where α and β are weighting factors, selected to be 0.95 and 0.05, respectively.These weights were used to make G positive-definite, improve convergence 21,22 , scale the genomic information to be compatible with the pedigree information and to control bias 23,24 ; G is the genomic relationship matrix 21 , obtained as follows: where Z is a matrix of gene content adjusted for allele frequencies (with elements of 0-2p, 1-2p, and 2-2p representing genotypes AA, Aa, and aa, respectively; p is the minor allele frequency (MAF)), D represents a diagonal matrix carried the weights of SNP, p i is the MAF of the i th SNP and m implies the number of SNPs.
Variance components and heritability of the studied traits were estimated using average information restricted maximum likelihood (AI-REML) method 25 via pedigree data.Marker effects and further weights required for constructing G matrix were calculated in an iterative way proposed by Wang et al. 10,26 .An iteration method with the steps described below was used for the association study.

GEBV calculation,
In this step, GEBVs were calculated for the entire population using the ssGBLUP approach, considering: 3. Calculation of marker effects, Marker effects ( u ) were obtained via GEBV conversion as,û(t) = D (t) Z′G −1 (t) a g where a g is the GEBV of individuals that were genotyped.4. Calculation of SNP weights, SNP weights for the next iteration were obtained using the Non linearA approach as follows, where CT is a constant, (1.125, 21 ), that determines the departure from normality; a i is the estimated effect for marker j, and sd( a) is the standard deviation of the estimated effects of SNP.The NonlinearA approach avoids extreme values for weights and provides good convergence properties 27 .This is achieved as the maximum variation in weights is limited by the minimum between 5 and the exponent of CT 28 .5. The normalization of SNP weights to keep the total genetic variance constant as follows: 6. Weighted G calculation ( G (t+1 ) as follows, 7. t = t + 1 and loop to step 2.
The SNP effects were estimated by 3 iterations.The percentage of genetic variance explained by the i th SNP window was calculated as follows: where ai is the genetic value of the ith SNP window that consists of a region of consecutive SNPs; σ 2 a is the total additive genetic variance; Z j is the vector of the gene content of the jth SNP for all individuals; and u j is the effect of the jth SNP within the ith window.
The iteration steps above were conducted via the BLUPF90 software family 23 for genomic analyses 29 .Initially, variance components were estimated using AIREMLF90 under a univariate animal model considering maternal effects.Estimated variance components then used in BLUPF90 for GEBVs prediction.PostGSf90 software was then used for SNP effects calculation.The proportion of variance explained by non-overlapping windows was estimated by summing the variance of SNPs within 1 megabase (Mb) 30,31 .SNP-density plot created by the CMplot package 32 representing the number of quality-passed SNPs within each 1Mbp window size is given in Fig. 1.The Top 10 window regions explaining the highest percentages of additive and maternal genetic variance were selected as candidate QTL regions associated with body weights as used in previous studies in animal

Birth weight
The top 10 significant SNP windows that contributed to direct and maternal genetic effects for birth weight along with the identified potential candidate genes are listed in Table 1.The top 10 SNP windows explained a total of 4.30% of direct additive genetic variance.The additive genetic variance explained by the genomic window regions varied from 0.59 on chromosome 8 (37,942,516-38,934,945) to 0.31 on chromosome 1 (214,063,075-215,052,580) (Supplementary Tables 2-4).Further investigation of these regions on the NCBI Genome Data Viewer identified many genes in these areas.
For maternal effects on birth weight, a total of 4.62% of maternal genetic variance was explained by the top 10 SNP windows.The highest maternal genetic variance (0.84%) was explained by a genomic window on chromosome 10 (41,802,553-42,753,634), and the lowest maternal genetic variance (0.32%) was explained by a genomic window on chromosome 17 (33,670,501-34,661,256).In these genomic window regions, several genes for maternal effects were identified (Table 1).Circle Manhattan plot from WssGWAS for direct and maternal genetic effects on birth weight is presented in Fig. 2. Results of gene ontology and KEGG signaling pathway analysis of candidate genes are presented in Table 2 Weaning weight Genomic windows identified for direct genetic and maternal effects that were significantly associated with weaning weight are listed in Table 3.For direct additive genetic effects, the top 10 genomic windows explained a total of 6.4% of the genetic variance.The contribution of explained genetic variance by genomic windows varied from 1.37% on chromosome 2 (235,006,576-235,981,086) to 0.42%, again on chromosome 2 (203,829,884-204,804,430) (Supplementary Tables 5-7).Chromosome 2 with 4 genomic window regions showed the most contribution of additive genetic variance.Several genes were identified for these genomic windows.For instance, the most significant window included LAPTM5, MATN1, SERINC2, FABP3, ZCCHC17, TRNA-CUG, SNRNP40, NKAIN1, PUM1 and SDC3 genes (Table3).
For maternal effects on WW, the highest and lowest maternal genetic variances were explained by genomic windows on chromosome 2.The top 10 window genomic regions explained 5.76% of the maternal genetic variation.The identified genes for the top genomic windows are listed in Table 3. Circle Manhattan plot from WssGWAS for direct and maternal genetic effects on WW is presented in Fig. 3. Significant GO terms and KEGG signaling pathways of candidate genes related to weaning weight are presented in Table 4.

Discussion
In the present study, we identified the top 10 genomic windows that accounted for the highest explained genomic variance as candidate genomic regions for each trait.Studies have shown that the use of sliding windows for simultaneous analysis of multiple SNPs in ssGWAS is able to reduce the noise due to the estimation process 36 .
The ssGWAS method combines all genotype, phenotype and pedigree data in single step 10 .Compared with single-marker regression GWAS, this method can use all markers simultaneously, resulting in greater power and precise estimate values 36 .Moreover, ssGWAS utilizes a large amount of phenotypic data of ungenotyped individuals which results in increasing the sample size to a certain extent, improving the accuracy of SNP effect estimation, and increasing the efficiency of SNP identification 37 .However, the ssGWAS model, considers equal variance for all marker effects and therefore may be constrained when traits are influenced by large QTL 10 .To overcome this challenge, the WssGWAS approach was proposed, in which SNP effects are weighted according to their importance in genetic variance of the trait, which improves the precision of QTL detection 10 .Various studies have shown that growth traits such as body weight at birth and weaning in sheep are affected by direct genetic and maternal effects 16,38 .It has been proven that considering maternal effects in genetic evaluation models leads to a more accurate estimation of (co)variance, genetic parameters and, genetic evaluation of these traits 39 .By using the ssGWAS method, phenotypic and pedigree information of genotyped and nongenotyped animals together with the hybrid relationship matrix leads to a more reliable QTL detection and reducing the probability of spurious signals 40 .In this study, we performed WssGWAS to identify genomic regions and associated candidate genes responsible for genetic variation in BW and WW in Baluchi sheep.The identified genomic regions were associated with 108 and 97 possible candidate genes for BW and WW, respectively among which some of them were previously reported as candidate genes involved in body weight.Promising candidate genes found included AHCYL2, HILPDA, KCP, LEP, FZD9, MDH2, CLDN4, OSBPL2EIF5A2, RPL22L1, RPL22L1, TNIK, CYSTM1, DNAJC18, ECSCR and BBS12 for BW; and FABP3, MATN1, SDC3, FNDC5, PIKfyve, CCSER1, MMRN1, PDCL3, RRP15, TGFB2, RAPH1, ACSL6 and CSF2 for WW.
AHCYL2 gene has been reported to be associated with the growth, carcass traits, and the meat quality of sheep 41 .AHCYL2 gene has been reported as the main gene affecting the backfat thickness of beef cattle 42 .It has been documented that body weight in sheep could be controlled through up-regulation of AHCYL2 gene expression by regulating fat content and muscle development 43 .HILPDA is a small, lipid-droplet-associated protein expressed in several tissues that elevate lipid storage in hepatocytes, adipocytes, and macrophages 44 through directly binding and inhibiting adipose triglyceride lipase 45 .The results of studies on muscle proteomic profiles show that body weight in sheep can be influenced by fat deposition 43 .KCP is a secreted protein that regulates the expression and function of BMPs 46 which are multi-functional growth factors that belong to the transforming growth factor β (TGFβ) superfamily.The roles of BMPs in embryonic development have been extensively reported 47 .Leptin is principally produced in adipose tissue and is engaged in the regulation of body homeostasis, energy intake, storage and expenditure, fertility, and immune functions 48 .Haplotypes of LEP genes have been confirmed to be associated with body traits in sheep 49 .Leptin, with respect to its role in the regulation of energy metabolism, has been involved in the harmonious adjustment of maternal adaptations during pregnancy and lactation 50 .It has been reported that human breast milk leptin during the first stages of lactation may regulate processes that affect infant weight gain from the first to the sixth month 51 .FZD9 has been reported to be associated with lipid metabolism during lactation, quantity of milk produced 52 , pregnancy and, fertility 53 which can support the role of this gene in the weight of lambs at birth.It has been reported that the MDH2 gene is less expressed in leaner pig breeds, and also, in general, this gene is expressed at higher levels in females than in males 54 , which could be concluded that this gene is related to body weight.It has been reported that the growth, health and, flexibility of low birth-weight pigs are compromised due to imperfect intestinal development.CLDN4 is a gene involved in the function of the intestinal barrier, which has a lower expression in pigs with low birth weight, which, along with the delay in the development of intestinal villi and crypts, could be a factor for the growth potential in female piglets with low birth weight 55 .OSBPL2 is one of the lipid transfer proteins that regulate intracellular cholesterol homeostasis and lipid droplet lipolysis in a way that OSBPL2 deficiency enlarges intracellular lipid droplets which may lead to obesity 56 .It has been reported that overexpression of EIF5A2 results in decreased growth rate and body weight in adult transgenic mice 57 .Lower expression of the RPL22L1 gene, involved in protein synthesis, has been confirmed to be associated with compromised bone formation and growth in the lambs with the most efficient body weight gain 58 .RPL22L1 is also reported as a candidate gene for birth weight in Holstein Friesian 59,60 .It has been demonstrated that TNIK has a critical role in the regulation of energy balance, glucose and fatty acid metabolism, lipid deposition, and insulin sensitivity, and TNIK knockout mice show a leaner phenotype.Studies demonstrated that CYSTM1 is significantly associated with the gestation length 61,62 .It has been confirmed that the gestation length is genetically in positive correlation with birth weight 63 .DNAJC18 and ECSCR contributed to maternal genetic effects for birth weight are associated with heat stress in cattle 64,65 .Heat stress during late gestation has been reported to reduce calf birth weight, which last significantly until 7 days of age (Trifkovic 2018).The BBS12 gene has been reported as a candidate gene for pure meat weight, foreshank weight, and silverside weight in beef cattle 66 .
FABP3 genes contributed to both direct additive and maternal genetic effects for weaning weight.FABP3 is involved in fatty acid transport from the cell membrane to the intracellular sites of fatty acid utilization and is mainly expressed in cardiac and skeletal muscle 67 .Clvo et al. 68 reported a linkage disequilibrium between the FABP3 gene and a quantitative trait locus for milk fat contents in Manchega breed sheep.It has been reported that the composition of fatty acids in maternal milk fat has a positive effect on the live weight of lambs and the optimization of the growth potential of lambs until weaning 69 .Fatty acids that are produced in the liver and muscles have special functions in the body, and form the primary sources of energy stored in triglycerides, and participate in the formation of complex lipids, hormones and signaling compounds 70 .Cahyadi 71 reported that FABP3 was significantly associated with body weights at birth, at 12 to 20 weeks, and slaughter in Korean Native Chickens.Kuzminska 72 found significant associations between the FABP3 genotypes and the body weight at 210 days of age in Simmental cows.It was reported that there was a significant association between FABP3 gene variants and body weight and hock weight in swine 73 .The genetic polymorphisms of FABP3 were associated with early and late-stage stage body weights in the Landrace × Jeju black pig population 74 .It has been reported that the MATN1 gene plays an important role in muscle growth at different stages of development.The lower expression of this gene is associated with a decrease in muscle growth in low-birth-weight fetuses 75 .Feeding behavior and body weight are regulated by the SDC3 gene 76 .Srinivasa et al. 77 reported that expression of FNDC5 in skeletal muscle was associated with mRNA expression of IGF-I in obese individuals with reduced growth hormone showing a possible function for FNDC5 in skeletal muscle under a low growth hormone state.It was reported that the PIKfyve pathway is essential in mammary epithelial differentiation during pregnancy 78 which supports its role in the maternal ability to grow lambs and their weaning weight.It has been reported that CCSER1 is associated with feed efficiency in beef cattle 79 , growth and feed intake in sheep 80 , knuckle, biceps, and shank of beef www.nature.com/scientificreports/carcass traits 81 .MMRN1 has been confirmed to be associated with feed efficiency, growth, and carcass traits in beef cattle 82 .It has been proved that NMS is involved in suckling-induced oxytocin release which is essential for milk ejection in mammals 83 .Lack of oxytocin or its receptor in mice results in deficient milk ejection response to the suckling stimulus 84 .It has been documented that PDCL3 variants are significantly associated with weight at 12 months of age, carcass weight, and loin eye area in an indigenous Korean cattle breed 85 .PDCL3 encodes a chaperone protein which through interacting with vascular endothelial growth factor receptor 2 is involved in angiogenesis 86 .RRP15 has been reported as a candidate gene for milk yield in dairy sheep which is expressed in either the milk transcriptome or the mammary gland 87 .TGFB2 gene is one of the members of the TGFB superfamily, which plays important roles in embryogenesis, cell differentiation, muscle development, growth, and reproductive regulation 88,89 .The RAPH1 gene has been confirmed to be associated with cell migration and nutrient absorption by rumen epithelia cells 90 .ACSL6 has been previously reported as a candidate gene associated with dry matter intake and mid-test body weight in the Angus population 91 .In mammals, ACSL gene is essential for fatty acid degradation, remodeling of the phospholipid, and the long-chain acyl-CoA esters production that regulate different kinds of physiological, metabolism, and cell signaling processes 92 .CSF2 has been reported as one of the regulatory molecules that mediate maternal effects on embryonic development during the preimplantation period and enhance embryo competence for post-transfer survival 93 .

Conclusions
In the present study, we conducted a WssGWAS and identified many genomic regions and known Ovine candidate genes for birth weight and weaning weight in sheep which can provide new insights into the genetic basis of growth traits in sheep.These findings can be further investigated to search for causative mutations underlying body weights in lambs and for marker-assisted selection to improve the meat production in sheep.The genetic architecture of growth traits is very complex and these candidate genes may be under breed-specific effects.Therefore, it is recommended to study the genes identified in this study as candidate genes in different breeds of sheep under different environmental conditions.This leads to a better understanding of the complex pathways underlying growth traits in sheep.One limitation of our study is the sample size of genotyped individuals which is not large.Therefore, further high-resolution researches utilizing larger sample sizes may help validate the genomic regions and candidate genes associated with body weight in sheep identified in the present study.

Figure 1 .
Figure 1.The SNP-density plot representing the number of quality-passed SNPs within each 1Mbp window size.

Figure 2 .
Figure 2. The circular Manhattan plot from the WssGWAS for the direct additive and maternal genetic effects on birth weight explained by SNPs.The 2 circles from inside to outside occupy the maternal and direct additive genetic variance explained by SNPs (%), respectively.The outermost circle shows the SNP density in the 1 Mb window for each chromosome.

Figure 3 .
Figure 3.The circular Manhattan plot from the WssGWAS for the direct additive and maternal genetic effects on weaning weight explained by SNPs.The 2 circles from inside to outside occupy the maternal and direct additive genetic variance, respectively.The outermost circle shows the SNP density in the 1 Mb window for each chromosome. https://doi.org/10.1038/s41598-024-63974-0www.nature.com/scientificreports/ https://doi.org/10.1038/s41598-024-63974-0

Table 1 .
The top 10 significant genomic windows and candidate genes contributing to the direct additive and maternal genetic effects for birth weight in the Baluchi sheep.

Table 2 .
Significant GO terms and KEGG signaling pathways of candidate genes related to birth weight in Baluchi sheep.

Table 3 .
The top 10 significant genomic windows and candidate genes contributing to the direct additive and maternal genetic effects for weaning weight in the Baluchi sheep.

Table 4 .
Significant GO terms and KEGG signaling pathways of candidate genes related to weaning weight in Baluchi sheep.